6 research outputs found

On Shapley Value in Data Assemblage Under Independent Utility

Authors: Cong Zicun, Luo Xuan, Pei Jian, Xu Cheng
Publication venue: 'VLDB Endowment'
Publication date: 01/08/2022

In many applications, an organization may want to acquire data from many data owners. Data marketplaces allow data owners to form coalitions that produce the data assemblage needed by data buyers. To encourage coalitions to produce data, it is critical to allocate revenue to data owners fairly, according to their contributions. Although Shapley fairness and its alternatives have been well explored in the literature as a basis for revenue allocation in data assemblage, computing the exact Shapley value for many data owners and large assembled data sets remains challenging due to the combinatorial nature of the Shapley value. In this paper, we explore the decomposability of utility in data assemblage by formulating the independent utility assumption. We argue that independent utility has many applications. Moreover, we identify interesting properties of independent utility and develop fast computation techniques for the exact Shapley value under independent utility. Our experimental results on a series of benchmark data sets show that our new approach not only guarantees the exactness of the Shapley value, but also achieves faster computation by orders of magnitude.

Comment: Accepted by VLDB 202
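
To illustrate why exact computation is expensive, the following is a minimal sketch of the textbook Shapley value computed by enumerating every coalition. It is not the paper's fast method, and it does not encode the paper's independent utility assumption, which the abstract does not define. The function name `exact_shapley`, the toy owners `A`, `B`, `C`, and the `coverage` utility (number of distinct records a coalition holds) are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def exact_shapley(players, utility):
    """Textbook Shapley values by brute force.

    Enumerates every coalition that excludes each player, so the cost
    grows exponentially with len(players).
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k in the Shapley formula.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


# Hypothetical toy data: each owner holds a set of record ids.
owned = {"A": {1, 2}, "B": {2, 3}, "C": {4}}


def coverage(coalition):
    """Assumed utility: number of distinct records held by the coalition."""
    return len(set().union(*(owned[o] for o in coalition)))


shapley = exact_shapley(list(owned), coverage)
print(shapley)
```

With three owners the loop evaluates only a handful of coalitions, but the count doubles with each additional owner, which is the combinatorial cost the abstract refers to.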

arXiv.org e-Print Archive

Mining Identification Rules for Classifying Mobile Application Traffic

Author: Cong Zicun
Publication date: 05/08/2016

Classifying mobile application traffic is important in many network management tasks. Existing works rely on human expertise and reverse engineering to build classification rules. The huge number of mobile applications makes it ineffective, and even infeasible, to reverse engineer every mobile application. In this thesis, we design a novel structure of app identification rules. Two algorithms are developed to mine the rules from HTTP header fields without any other external input. In addition, we explore the function and effects of different HTTP header fields in the identification task. An extensive empirical study on real data verifies the effectiveness of our algorithms.

Simon Fraser University Institutional Repository

Data pricing in machine learning pipelines

Authors: CONG Zicun, LUO Xuan, PEI Jian, ZHANG Yong, ZHU Feida
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/05/2022

Institutional Knowledge at Singapore Management University

Data pricing and data asset governance in the AI Era

Authors: CONG Zicun, HUIWEN Liu, MU Xin, PEI Jian, XUAN Luo, ZHU Feida
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/08/2021

Institutional Knowledge at Singapore Management University